Compensating Gender Variability in Query-by-Example Search on Speech Using Voice Conversion
Abstract
The huge amount of available spoken documents has raised the need for tools that perform automatic searches within large audio databases. These collections usually consist of documents with great variability in speaker, language, and recording channel, among other factors. Reducing this variability would boost the performance of query-by-example search on speech, especially in zero-resource systems that use acoustic features for audio representation. Hence, this work proposes a technique to compensate for the variability caused by speaker gender. Given a data collection composed of documents spoken by both male and female voices, every time a spoken query has to be searched, an alternative version of the query in the opposite gender is generated using voice conversion. The female version of the query is then used to search within documents spoken by females, and the male version within documents spoken by males. Experimental validation shows that the proposed strategy improves search on speech performance by reducing gender variability.
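The gender-matched search described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: documents are assumed to carry a known gender label, features are plain lists of float vectors, matching uses whole-sequence dynamic time warping (real query-by-example systems typically use a subsequence variant), and `toy_convert` is a hypothetical stand-in for a real voice conversion system. The names `dtw_cost`, `gender_matched_search`, and `toy_convert` are invented for this sketch.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(query, doc):
    """Length-normalised DTW alignment cost between two feature sequences."""
    n, m = len(query), len(doc)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], doc[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)


def gender_matched_search(query, query_gender, documents, convert_voice):
    """Score each document against the query version of the same gender.

    documents maps a document id to a (gender, features) pair.
    convert_voice(features, target_gender) is an assumed callable that
    returns the query features converted to the target gender.
    """
    other = "male" if query_gender == "female" else "female"
    versions = {
        query_gender: query,
        other: convert_voice(query, other),
    }
    scores = {
        doc_id: dtw_cost(versions[doc_gender], feats)
        for doc_id, (doc_gender, feats) in documents.items()
    }
    # Lower cost means a better match.
    return sorted(scores.items(), key=lambda item: item[1])


def toy_convert(features, target_gender):
    """Toy stand-in for voice conversion: shifts every feature by a constant.

    This is an assumption for demonstration only; it does not model speech.
    """
    shift = 1.0 if target_gender == "female" else -1.0
    return [[x + shift for x in frame] for frame in features]


# Toy data: the "female" document is the male query shifted by +1.0.
male_query = [[0.0], [1.0], [2.0]]
documents = {
    "m_doc": ("male", [[0.0], [1.0], [2.0]]),
    "f_doc": ("female", [[1.0], [2.0], [3.0]]),
}
ranking = gender_matched_search(male_query, "male", documents, toy_convert)
```

In this toy setting, searching the female document with the converted query removes the constant offset that would otherwise inflate its alignment cost, which is the effect the gender compensation aims for.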
Related articles
Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article examines methods for improving the performance of GMM-based voice conversion systems, taking the GMM method as the baseline. In current methods, because a single conversion function is used to convert all speech units and statistical averaging then causes spectral smoothing, we will observe quality ...
Codec integrated voice conversion for embedded speech synthesis
Voice conversion technologies transform individual characteristics of speech patterns while preserving the original content, and can be widely used in speech processing. Considering limited system resources, in particular, of embedded concatenative speech synthesis, voice conversion may reduce the memory consumption of the acoustic database. Voice conversion enables the intra-gender or cross-ge...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
Novel method for data clustering and mode selection with application in voice conversion
Since the statistical properties of speech signals are variable and depend heavily on the content, it is hard to design speech processing techniques that would perform well on all inputs. For example, in voice conversion, where the aim is to transform the speech uttered by a source speaker to sound as if it was spoken by a target speaker, different types of interspeaker relationships can be fou...
Designing a new non-parallel training method for voice conversion that outperforms parallel training
Introduction: Voice mimicking by computers has been one of the most challenging topics of speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker, who is on the other side. Two methods of p...